[SC-7864] Create credit risk scorecard notebook using XGBoost by juanmleng · Pull Request #280 · validmind/validmind-library

juanmleng · 2025-01-02T14:04:44Z

Internal Notes for Reviewers

Add new application scorecard notebooks using ML with additional testing:

- application_scorecard_with_ml.ipynb: running individual tests
- application_scorecard_full_suite: using run_documentation_tests()

External Release Notes

Add new application scorecard notebooks using ML with additional testing:

- application_scorecard_with_ml.ipynb: running individual tests
- application_scorecard_full_suite: using run_documentation_tests()

johnwalz97

lgtm

github-actions · 2025-01-03T22:24:37Z

PR Summary

This pull request introduces several enhancements and bug fixes to the ValidMind Library, particularly focusing on credit risk scorecard modeling. The key changes include:

- New Notebooks: Two new Jupyter notebooks have been added to demonstrate the application scorecard model using the ValidMind Library. These notebooks provide a step-by-step guide for loading a demo dataset, preprocessing data, training models, and documenting the model using ValidMind.
- New Tests: Several new tests have been added to the validmind/tests directory, including:
  - MutualInformation: Evaluates feature relevance by calculating mutual information scores between features and the target variable.
  - ScoreBandDefaultRates: Analyzes default rates and population distribution across credit score bands.
  - CalibrationCurve: Assesses the calibration of probability estimates by comparing predicted probabilities against observed frequencies.
  - ClassifierThresholdOptimization: Analyzes and visualizes different threshold optimization methods for binary classification models.
  - ModelParameters: Extracts and displays model parameters for transparency and reproducibility.
  - ScoreProbabilityAlignment: Evaluates the alignment between credit scores and predicted probabilities.
- Enhancements to Existing Tests: Modifications have been made to existing tests to improve their functionality and accuracy. For example, the TooManyZeroValues test now includes a row count and uses a percentage threshold for zero values.
- Dataset Splitting Functionality: The split function in lending_club.py has been enhanced to support an optional validation set, allowing for more flexible dataset splitting.
- Test Configuration Utility: A new utility function get_demo_test_config has been added to generate a default test configuration for demo purposes.
- Version Update: The version of the ValidMind Library has been updated from 2.7.3 to 2.7.4.
- Bug Fixes: Various bug fixes have been implemented, including corrections to test logic and improvements to test coverage.
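
To make the ScoreBandDefaultRates description above more concrete, here is a minimal sketch of the idea: bucket scores into bands, then report the default rate and population share per band. This is not the library's implementation. The function name `score_band_default_rates`, the band edges, and the sample data are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def score_band_default_rates(scores, defaults, band_edges):
    """Summarise default rate and population share per score band.

    band_edges are ascending interior cut points, so n edges give n + 1 bands.
    A score equal to an edge falls into the higher band.
    (Hypothetical helper, not part of the ValidMind Library.)
    """
    if len(scores) != len(defaults):
        raise ValueError("scores and defaults must have the same length")

    n_bands = len(band_edges) + 1
    counts = [0] * n_bands
    bads = [0] * n_bands
    for score, default in zip(scores, defaults):
        band = bisect_right(band_edges, score)  # index of the band for this score
        counts[band] += 1
        bads[band] += int(default)

    total = len(scores)
    rows = []
    for band in range(n_bands):
        low = band_edges[band - 1] if band > 0 else None
        high = band_edges[band] if band < len(band_edges) else None
        rows.append({
            "band": (low, high),  # None marks an open-ended side
            "count": counts[band],
            "population_share": counts[band] / total if total else 0.0,
            "default_rate": bads[band] / counts[band] if counts[band] else None,
        })
    return rows


# Illustrative data only: 1 marks a default, 0 a non-default.
scores = [480, 520, 590, 610, 640, 700, 720, 760]
defaults = [1, 1, 0, 1, 0, 0, 0, 0]
summary = score_band_default_rates(scores, defaults, band_edges=[550, 650])
for row in summary:
    print(row)
```

On a well-behaved scorecard the default rate would be expected to fall as the band moves up the score range, which is the pattern a reviewer would look for in this kind of table.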

Test Suggestions

- Run the new Jupyter notebooks to ensure they execute without errors and produce the expected outputs.
- Verify the functionality of the new tests by running them with different datasets and configurations.
- Test the enhanced split function with various dataset sizes and configurations to ensure it correctly handles train, validation, and test splits.
- Check the accuracy and performance of the MutualInformation and ScoreBandDefaultRates tests with known datasets.
- Validate the CalibrationCurve and ClassifierThresholdOptimization tests by comparing their outputs with expected calibration and threshold optimization results.
- Ensure the ModelParameters test correctly extracts parameters from different model types.
- Test the ScoreProbabilityAlignment test with datasets having different score distributions.
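
As a rough illustration of the split check suggested above, the sketch below shows one way a splitter with an optional validation set can behave, along with the size checks a reviewer might run. The signature of the actual split function in lending_club.py is not shown on this page, so `split_rows` and its `test_size`, `validation_size`, and `seed` parameters are assumptions made for this example only.

```python
import random


def split_rows(rows, test_size=0.2, validation_size=None, seed=42):
    """Shuffle rows and split them into train/test or train/validation/test.

    Hypothetical stand-in for a splitter with an optional validation set;
    it does not mirror the ValidMind Library's own function.
    """
    if not 0 < test_size < 1:
        raise ValueError("test_size must be between 0 and 1")
    val_frac = validation_size if validation_size is not None else 0.0
    if val_frac < 0 or test_size + val_frac >= 1:
        raise ValueError("test_size + validation_size must be below 1")

    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits

    n_test = round(len(shuffled) * test_size)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]

    # Only return a validation set when the caller asked for one.
    if validation_size is None:
        return train, test
    return train, validation, test


rows = list(range(100))
train, test = split_rows(rows)
print(len(train), len(test))

train3, validation3, test3 = split_rows(rows, test_size=0.2, validation_size=0.1)
print(len(train3), len(validation3), len(test3))
```

The useful properties to check are that the subsets are disjoint, that together they cover every input row, and that the two-way call still works when no validation size is given.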

juanmleng added 3 commits January 2, 2025 14:57

Add application scorecard with ml notebooks and new tests

4a143f2

Merge branch 'main'

30fa3c5

Update validmind install

43d4f74

juanmleng added the "internal" (Not to be externalized in the release notes) and "DO NOT MERGE" (PR is not ready to be merged) labels Jan 2, 2025

juanmleng requested a review from MichaelIngvarRoenning January 2, 2025 14:04

juanmleng self-assigned this Jan 2, 2025

juanmleng added 10 commits January 2, 2025 15:09

Add copyright headers

01de056

Merge branch 'main'

9396e44

Add env variable to add external context for test descriptions.

27397f5

Add ENV variable to input additional context to LLM-based test descriptions

dd1dd72

Fix unit tests

51009b6

Merge branch 'main'

7a02b59

Fix integration tests and HyperParametersTuning

baed1e6

Fix lint

3c34900

Merge branch 'main'

9ebf65d

Fix HyperParametersTuning

f2cda08

juanmleng removed the "DO NOT MERGE" (PR is not ready to be merged) label Jan 3, 2025

juanmleng requested review from AnilSorathiya, cachafla and johnwalz97 January 3, 2025 22:14

johnwalz97 approved these changes Jan 3, 2025

2.7.4

58d1a1a

juanmleng merged commit 129c33c into main Jan 3, 2025
6 checks passed

johnwalz97 deleted the juan5508/sc-7864/create-credit-risk-scorecard-notebook-using-xgboost branch January 6, 2025 16:00

cachafla added the "enhancement" (New feature or request) label and removed the "internal" (Not to be externalized in the release notes) label Jan 28, 2025

juanmleng merged 14 commits into main from juan5508/sc-7864/create-credit-risk-scorecard-notebook-using-xgboost

juanmleng commented Jan 2, 2025 • edited by cachafla

3 participants
